PACKING ELLIPSOIDS WITH OVERLAP* 
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Abstract. The problem of packing ellipsoids of different sizes and shapes into an ellipsoidal 
container so as to minimize a measure of overlap between ellipsoids is considered. A bilcvcl opti- 
mization formulation is given, together with an algorithm for the general case and a simpler algorithm 
for the special case in which all ellipsoids are in fact spheres. Convergence results are proved and 
computational experience is described and illustrated. The motivating application — chromosome 
organization in the human cell nucleus — is discussed briefly, and some illustrative results are pre- 
sented. 
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1. Introduction. Shape packing problems have been a popular area of study in 
discrete mathematics over many years. Typically, such problems pose the question of 
how many uniform objects can be packed without overlap into a larger container, or 
into a space of infinite extent with maximum density. In ellipsoid packing problems, 
the smaller shapes are taken to be ellipsoids of known size and shape. In three 
dimensions, the ellipsoid packing problem has become well known in recent years, due 
in part to colorful experiments involving the packing of M&Ms |6] . 

Finding densest ellipsoid packings is a difficult computational problem. Most 
studies concentrate on the special case of sphere packings, with spheres of identical 
size. Here, optimal densities have been found for the infinite Euclidean space of 
dimensions two and three. In two dimensions, the densest circle packing is given by 
the hexagonal lattice (see [Ej), where each circle has six neighbors. The density of 
this packing (that is, the proportion of the space filled by the circles) is 7r/vT2. In 
dimension three, it has been proven recently by Hales |5] that the face-centered cubic 
(FCC) lattice achieves the densest packing. In this arrangement, every sphere has 
12 neighboring spheres and the density is tt/vTS. For dimensions higher than 3, the 
problem of finding the densest sphere packing is still open. 

A problem related to sphere packing is sphere covering. Here, the goal is to find 
an arrangement that covers the space with a set of uniform spheres, as economically 
as possible. Overlap is not only allowed in these arrangements, but inevitable. The 
density is defined similarly to sphere packing (that is, the total volume of the spheres 
divided by the volume covered), but now we are interested in finding an arrangement 
of minimal density. In two dimensions, as for circle packing, the optimal circle covering 
is given by the regular hexagonal arrangement. However, the thinnest sphere covering 
in dimension 3 is given not by the FCC lattice, but by the body-centered cubic (BCC) 
lattice. In this arrangement, every sphere intersects with fourteen neighboring spheres; 
see for example [Hj. 

In this paper we study a problem that falls between ellipsoid packing and covering. 
Given a set of ellipsoids of diverse size and shape, and a finite enclosing ellipsoid, we 
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seek an arrangement that minimizes some measure of total overlap between ellipsoid 
pairs. 

Our formulation is motivated by chromosome organization in human cell nuclei. 
In biological sciences, the study of chromosome arrangements and their functional 
implications is an area of great current interest. The territory occupied by each 
chromosome can be modeled as an ellipsoid, different chromosomes giving rise to 
ellipsoids of different size. The enclosing ellipsoid represents a cell nucleus, the size 
and shape of which differs across cell types. Overlap between chromosome territories 
has biological significance: It allows for interaction and co-regulation of different genes. 
Also of key significance are the DNA-free interchromatin channels that allow access 
by regulatory factors to chromosomes deep inside a cell nucleus. Smaller nuclei tend 
to have tighter packings, so that fewer channels are available, and the chromosomes 
packed closest to the center may not be accessible to regulatory factors. 

The arrangement of chromosome territories is neither completely random nor 
deterministic. Certain features of the arrangement are believed to be conserved during 
evolution [15] , but can change during such processes as cell differentiation and cancer 
development [11] . In general, smaller and more gene-dense chromosomes are believed 
to be found closer to the center of the nucleus [1], and heterologous chromosomes 
tend to be nearer to each other than homologous pairs [5]. For further background 
on chromosome arrangement properties, see [51 [TS] . 

A major goal of this paper is to determine whether the experimental observations 
made to date about chromosome organization can be explained in terms of simple geo- 
metrical principles, such as minimal overlap. The minimum-overlap principle appears 
to be consistent with the tendency of chromosome territories to exploit the whole 
volume of the nucleus, to make the DNA-free channels as extensive as possible. Our 
formulation also includes features to discourage close proximity of homologous pairs. 

The remainder of the paper is organized as follows. In Section [2| we outline 
the mathematical formulation, define notation, and state a key technical result con- 
cerning algebraic formulations of ellipsoidal containment. In Section [3| we study the 
special case of finding a minimal overlap configuration of spheres inside an ellipsoidal 
container. We describe a simple iterative procedure based on convex linearized ap- 
proximations that produces convergence to stationary points of the minimal-overlap 
problem. We show through simulations that our algorithm can be used to recover 
known optimal circle and sphere packings. In Section [4| we generalize our optimiza- 
tion procedure to ellipsoid packing, introducing trust-region stabilization and proving 
convergence results. Section [5] describes the application of our algorithms to chromo- 
some arrangement. 

Notation. When A and B are two symmetric matrices, the relation A ^ B 
indicates that B — Aia positive semidefinite, while A ^ B denotes positive definiteness 
of B — A. Similar definitions apply for ^ and 

Let X be a finite-dimensional vector space over the reals R endowed with inner 
product (•, •). (The usual Euclidean space R" with inner product {x,y) :~ x^y and 
the space of symmetric matrices iSR"^" with inner product {X,Y) :— trace(Xy) 
are two examples of particular interest in this paper.) Given a closed convex subset 
VL (Z X, the normal cone to at a point x is defined as 

Na{x) -.^{v e X\{v,y~'x) <Q for all y e n}. (1.1) 

We use df to denote the Clarke subdifferential of the function f : X ^ R. In 
defining this quantity, we follow Borwein and Lewis [5J p. 124] by assuming Lipschitz 
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continuity of / at a;, and defining the Clarke directional derivative as follows: 

I{y + th)-f{y) 



r{x-h) :-- 



lim sup 



t 



The Clarke subdifferential is then 

df{x) ;= {t; e X I (v, h) < f°{x; h), for ah h e X}. 



(1.2) 



When / is convex (in addition to Lipschitz continuous), this definition coincides with 
the usual subdifferential from convex analysis, which is 

df{x) -.^ {v eX \ f{y) > f{x) + {v,y- x) for all y G dom /} 

(see H Proposition 2.2.7]). 

2. Problem Description and Preliminaries. An ellipsoid f C R" can be 

specified in terms of its center c € R" and a symmetric positive definite eccentricity 
matrix 5 e S'R" 



We can write 

£ := {x e R" I (a; ~ cfS-^{x - c) < 1} = 

It is often convenient to work with the quantity S 
definite), and thus to rewrite the definition (2.1) as 

£ {x e R" I {x - cfY.-\x - c) < 1}. 



{c + 5u I ||u||2 < 1}. (2.1) 
:= S"^ (also symmetric positive 

(2.2) 



For the remainder of this section, we assume that rt = 3, that is, the ellipsoids are 
three-dimensional. The eigenvalues of S are the lengths of the principal semi-axes of 
f ; we denote these by ri, r2, and ra, and assume that these three positive quantities 
are arranged in nonincreasing order. It follows that the eigenvalues of E are r^, 
and r|, and that the matrices S and S have the form 



'2) 



S = Q 


















Q 







for some orthogonal matrix Q, which determines the orientation of the ellipse. 

In this paper, we are given the semi-axis lengths rn, ri2, and for a collection 
of N ellipsoids £i, i — 1,2, . . . , N. The goal is to specify centers Ci and matrices Si 
for these ellipsoids, such that 

(a) £i <Z £, for some fixed ellipsoidal container £; 

(b) The eigenvalues of Si are rn, ri2, and r^a, for i = 1, 2, . . . , A^; 

(c) Some measure of volumes of the pairwise overlaps £i\^£j, i,j = 1,2, . . . , N, 
i ^ j, is minimized. 

In the following subsections, we give more specific formulations of (c), first for the 
case in which all £i are spheres (that is, r^i — — r^a, i = 1,2,..., and then 
for the general case. For now, we note that a crucial element in formulating these 
problems is ellipsoidal containment, that is, algebraic conditions that ensure that one 
given ellipsoid is contained in another. This is the subject of the following lemma, 
which is a simple application of the S-procedure (see [3l Appendix B.2]). 
Lemma 2.1. Define two ellipsoids as follows: 



£ = {xeR^ 
£^{xeR^ 



{x- 
{x- 



cfS-^{x- 
cfS-'ix^ 



c) < 1} = {c + Su 
c) < 1} = {c + Su 



M2<1}, 
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The containment condition £ <Z £ can be represented as the following linear matrix 
inequality (LMI) in parameters c, S, c, and : There exists A G R such that 

-XI s 

A-1 (c-c)'^ 1 ^0. (2.3) 
S c-c -S^ 

Proof. The condition £ C £ can be expressed as 

{c + Su- c)'^S^'^{c + Su - c) < 1 for all u such that |lu||2 < 1. 

By multiplying out this inequality we get 

u'^SS-^Su + 2u^SS'^ (c - c) + (c - cYS-^{c - c) - 1 < 

for all u such that u^u — 1 < 0. By applying the S-procedure, we find that this is 
equivalent to the existence of A > such that 

SS-^S SS-^c-c) ^\ .2 41 

This expression is not linear in the variables c, S, c, and S"^, but an elementary 



Schur complement argument shows equivalence to the linear matrix inequality (2.3), 
completing the proof. □ 

As one special case, the condition £i d where £i is a sphere with center Ci and 
radius and £ is an ellipsoid centered at with matrix S, can be represented as the 
LMI: 



(2.5) 



The more general case of f ^ C £, where £i is an ellipsoid with center Ci and matrix 
Si and £ is an ellipsoid centered at with matrix S, can be represented as the LMI: 





(2.6) 



3. Sphere Packing. We give a problem formulation for the case in which all 
enclosed shapes are spheres (of arbitrary dimension), and present a successive approx- 
imation algorithm that is shown to accumulate or converge to a stationary point of 
the formulation. Some examples of results obtained with this approach are described 
at the end of the section. 

3.1. Formulation and Algorithm. When the inscribed objects are spheres, 
the variables in the problem are the centers Ci € i?™, i = 1,2, . . . , N , which we 
aggregate as follows: 

c:={ci,C2,...,cn)- (3.1) 

The radii r^, i — 1,2, . . . , N are given. We express the containment condition for each 
sphere as follows: 



£i C £ <J4> Ci G fti, 
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(3.2) 



where is a closed, bounded, convex set with nonempty interior. When £ is a sphere 
of radius R centered at 0, we have fli :— {c, : ||ci|| < R — r.^}. Otherwise, we can 



define fii imphcitly by Lemma 2.1 see in particular (2.5) 



A simple measure for the overlap between two spheres £i and £j is the diameter 
of the largest sphere inscribed into the intersection, which we denote by an auxiliary 
variable ^i^: 



^ij := max(0, {n +rj)-\\i 



ij >l<i<j<N- 



Our minimum-overlap problem can thus be formulated as follows: 



mm 



subject to 



ioi l<i <j < N 



for i^l,...,N, 



(3.3) 



(3.4a) 

(3.4b) 
(3.4c) 
(3.4d) 



where (3.4c) denotes the entrywise condition S^ij >0, l<i<j<N. The objective 



H : p"<." i)/2 _^ satisfies the following assumption. 



R' 



,n(«-l)/2 



R+ is convex and continuous, 



Assumption 1. The function H 
with the following additional properties: 

(a) H{0) = 0; 

(b) H{k > whenever ^ ^ 0; 

(c) 0<^<^^H{0<H{0- 

Assumption [l] is satisfied, for example, by the norms H{() — ||^||i, H{^) — ||^||2, 
and H(^) = ||^||oo = niaxi<i<j<Ar ICijl- In the application to be discussed below, we 
prefer the overlaps in the overlapping ellipsoids to be roughly the same size; for this 
purpose, the £2 and £00 norms are the most appropriate. 



Although the objective (3.4a) and containment constraints (3.4d) are convex, the 



problem (3.4) is nonconvex, due to the constraints (3.4b). A point is Clarke-stationary 
for (3.4) if the following conditions are satisfied, for some Ay eR, l<i<j<N: 



< g^J - Kj ± 6j > for some g,, G d^^M{0, l<t<j <N, (3.5a) 

N i-1 

>^ijWij -^XjiWj^ e Nq^{ci), i = 1,2, . . . ,N, (3.5b) 

3=i+l 3 = 1 

< 6j + Cjll - (r, + r^-) _L A.^ > 0, l<i<j<N, (3.5c) 
c, - c. 



where llwiJU < 1, with Wi 



C.7II2 



when a Cj, I < i < j < N. (3.5d) 



Condition (|3.5d) defines Wij to be in the subdifferential of 



1 2 with respect to 



See (1.1) for the definition of the normal cone in (3.5b) 



We now develop an algorithm that seeks a local solution of (3.4 1, by formulat 



ing a sequence of convex approximations in which the key feature is linearization of 



the nonconvex constraint (3.4b) around the current iterate. Because of the special 



properties of this problem, we need not apply the usual safeguards for this successive 
approximation approach, such as trust regions or line searches. Decrease of the objec- 
tive at each iteration and accumulation of the iteration sequence at first-order points 



of the problem (3.4) can be proved in the absence of these features. However, for pur- 



poses of stabilizing the iterates generated by the method, it may be desirable to place 
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a uniform bound on the length of each step. This can be done without comphcating 
the analysis, and we do so in our implementations. 



The linearization of (|3.4| around the current iterate c is defined as follows: 

(3.6a) 

subject to 



P{c-) :=miniJ(^) 



Ci € rii, for i = 1, . . . ,N, 



where Zi 







for 1 < i < j < iV, 



when c, 7^ c, 
otherwise. 



(3.6b) 
(3.6c) 
(3.6d) 

(3.6e) 



This problem is convex, with affine constraints except for the inclusion (3.6d), which 



can be satisfied strictly when each fli is closed, bounded, and convex, with nonempty 
interior. Hence (see for example jT3l Theorem 28.2, Corollary 28.3.1]), its solutions 
are characterized by the following KKT conditions: There exist A^-, 1 < i < j < N 
such that 



< gij - Xtj J- > for some .g,y e d^^M{0, 

N i-1 

y ] ^ijZij — ^ ] XjiZji G -^Oi(Ci), 
j=i+l j = l 

< 6j + zfj{ci - Cj) - (r,; + rj) _L Xij > 0, 



l<i<j<N, (3.7a) 
i = l,2,...,N, (3.7b) 
l<i<j<N. (3.7c) 



We can use a compactness argument to verify that solutions to (3.6 1 are attained. The 



vector of feasible centers c is restricted to a compact set, by the assumed properties of 
f2i, • ■ • , ^N- By using (3.6b) we can define effective upper bounds on the variables 
^ij as follows: 



^'ij := max 0, sup (r,; + r^) - zfjici - Cj) 



(For any feasible c, and given any ^ satisfying (3.6b I, we can always replace ^ by an 



alternative feasible point ^" £ [0, C] without increasing the value of H, by property (b) 
of Assumption [1]) Thus, the problem (3.6) reduces to minimization of a continuous 
convex function over a compact set, for which existence of a solution is guaranteed. 



Algorithm 1 Packing Spheres by Minimizing Overlap 



Given Vi, i ^ 1,2, . . . , N and 51^ closed, convex, bounded with nonempty interior; 
Choose c° e fii X X • • • X rijv; 
for A; = 0,1,2,... do 



Solve P(c'=) defined by ^ to obtain (c'=+\ ^'=+1); 
if ^(C'^+i) = iJ(C'=) then 

stop and return c*^; 
end if 



end for 



Sete*;+'=max(0,(r, + r,)-||cf 



c^- ' *|1 j for 1 < i < j < N; 
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Algorithm [T] is the simple algorithm based on the subproblcm ( |3.6[ ). To analyze 
convergence properties of this method, we start with basic results about stationary 
points and about the changes in H at each iteration of Algorithm [T] 

Lemma 3.1. Suppose that the sets ^li in (g.^P are closed, bounded, and convex, 
with a nonempty interior, and that Assumption M holds. Then the following claims 
are true. 

(i) If the point (c^, sa tisfies the optimality conditions (3.7) for the subpro blem 



for the problem l3.4\ 



P{c ) defined by (3.6), then (c , ^ ) satisfies the stationarity conditions (3.5) 



(ii) If th e point (c , ^'^') satisfies the stationarity conditions (3.5) for the problem 
{ 3.4) and in addition =^ Cj for all 1 < i < j < N, then (c*, ^'^) satisfies the 



3.6) 



optimality conditions (3.1) for the subproblem P{c*') d efin ed by 
(Hi) If {c^,^,^) does not satisfy the stationarity conditions (3.5), then II{£^^^^) < 

Hie). 

(iv) For each k we have 11(e) ^ H{e)- 
Proof. 

(i) If (c'',e) satisfies the optimality conditions (3.71 for P{c^), then by setting 
Wij = Zij in (3.5 1, we see that these conditions are also satisfied with the 



same values of gij and Xij. (We have made the particular choice Wij = 
when — c*^.) 

e) with ^ c^ for all i, j with 
for all such i, j. Thus by 



(ii) If the conditions (3.5 1 are satisfied at (c*^, 
l<i < j < N, then = (cf - c^)/||cf 



noting that z.^ — Wij for all such i, j, we can verify using the same values of 



(iii) 



gij and Xij that {c'',e) satisfies the optimality conditions (3.7), and therefore 
is a solution of P{c'^). 

Note that the point (c*^,^'') is feasible for the subproblem (3.6) (with = 
c'^), so its optimal objective satisfies 7J(p+^) < 11(e)- Since (c'',e) does 



not satisfy the stationa rity conditions ( |3.5[ ), however, part (i) implies that it 
cannot be a solution of (3.6), which implies that in fact iJ(^'"'+^) < 11(e), ^ 
claimed. 



(iv) By using the fact that 
I < i < j < N that 



2 < 1, we have for all fc > 1 and all i, j with 



e. = max(rj 



||2,0) < max(rj +r. 



-iz1-Yic'(-c';),o)<e 



The result now follows immediately from Assumption [TJc). □ 
Note that in the case of coinciding centers, i.e. c, = Cj for some i ^ j, the 



stationarity conditions for (3.4 1 and (3.6) are not equivalent. This observation yields 
the intriguing property — unusual in algorithms based on linear approximations — 
that Algorithm [l] may be able to move away from a stationarity point for (3.4). That 



is, if (c'^, e) satisfies ( 3.5 1 but there is some pair (i, j) with i ^ j and c^ = Cj, then by 
setting Zij = 0, the subproblem (3.7) may yield a solution (c''+^, f''+^) with H(e~^^) < 



H(e), and thus (by Lemma 3.1 "(iv)) the next iterate will satisfy ff(^'=+i) < H(e). 



ith 



< 1 when Ci = 



Note too that the proof of Lemma 3.1 (iv) still holds if zf is chosen to be any vector 



Hence, random choices for z*; in this situation could 



be used in place of our choice Zij — above, leading to some interesting algorithmic 
possibilities for avoiding coincident centers and moving away from stationary points. 
Since coincident centers rarely arise in the cases of interest, however, we do not pursue 
these possibilities. 
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(a) Global Solution: o = (b) Local Solution: o = .5 (c) Local Solution: o = .5 

.4122147478 



Fig. 3.1. Solutions obtained by Algorithm^ for packing circles of radius .5 into a circle of 
radius 1, showing final overlap measures for each. 



We now prove the main convergence result for Algorithm [T] 
Theorem 3.2. Suppose that the sets Q 



bounded, and convex, 
Then Algorithm [7] either 



3.4 ) are closed, 
with a nonempty interior, and that Ass umption \1 

terminates at a stationary point for (3.4), or else generates an infinite sequence {c*^} 
for which all accumulation points c are either stationary points for ( 3.4 ), or else have 
Ci ~ Cj for some p air ( i, j) with 1 < i < j < N . 



Proof. Lemma 3.1 (iii) says that termination can occur only if {c^, ^'^) satisfies the 
stationarity conditions ( |3.5[ ). Hence, we need to consider only the case of an infinite 
sequence of iterates {c'^}. Suppose for contradiction that there is an accumulation 
point c for this sequence such that Ci 7^ Cj for all {i,j) but c is not stationary for 



(3.4) 



Considering the problem P{c) defined by (3.6), we have by Lemma 3.1 (iii) 
:— H{^) — -ff(C) > (strict inequality), where — max(0,rj + r j — \\ci — Cj\\). 
Moreover, we can identify a neighborhood Af oi c such that for all c'^ e Af, we have 



that 



(r+ ) < H{C+') < Hit) - e/2, 



(3.8) 

This claim follows from Lemma [3.1| (iv) and the observation that the optimal objective 
in (3.6) is a continuous function of c~, for c~ near c. The face that Ci 7^ Cj for all (i, j) 
ensures that the za are continuous functions of c~, while H itself is continuou s by 



Assumption [T] Since there is a subsequence S with limfeg^ c*^ = c, we have from (3.81 



and monotonicity of the full sequence {i?(? )} that ) | —00. This is impossible, 
however, since H is bounded below by 0. We conclude therefore that all accumulation 
points c are either stationary or else have Ci = Cj for some pair {i,j), as required. □ 

As noted above, the case in which accumulation points have coincident centers 
is exceptional, so Theorem |3.2| shows that the algorithm usually either terminates or 
accumulates at stationary points. 

3.2. Examples. We present several examples showing results obtained with Al- 
gorithm [1] on various problems, and compare them with known results. To begin, 
a simple example to demonstrate the existence of local minima that are not global 
minima. 

Example 3.1 (Five Circles). Consider the problem of packing five circles of 
radius .5 into an enclosing circle of radius 1. Results obtained with Algorithm^ 
with objective H{^) — ||^||oo7 from random starting points reveal an apparent global 
solution (Figure 3.1(a)) and a family of local solutions (Figures 3.1(b) and 3.1(c)\ l. 



Fig. 3.2. Circle packing in a circular enclosure. A nearly hexagonal arrangement is seen in 
the interior. 



The local solutions are characterized by one of the packed circles having its center at 
the center of the enclosing circle; this circle thus has an overlap of .5 with all four 
of the outer circles. The outer circles in this local solution need only be arranged so 
that their maximum pairwise overlap is no greater than .5. Algorithm^ required only 
a few iterations for each of these examples. 

As noted in Section [l] optimal sphere packings (configurations with no overlap) 
have been obtained in two and three dimensions, for spaces of infinite extent. Our 
algorithm can only solve problems with finite enclosing shapes, but we can use large 
enclosures to investigate how similar the local solutions attained by our algorithm are 
to the known optimal packings in (hexagonal lattice with density tt/VTS) and R*^ 
(FCC lattice with density n/VlS). 

Example 3.2 (Uniform Circles in R^). We ran Algorithm^ with N — 150 
circles, each of area tt, and a circular container of size 150\/T2. This results in a 
total circle area-to- container area ratio which is equal to the optimal packing density. 
The resulting circle configuration is shown in Figure \37^ The hexagonal arrangement 
of the circles is clearly visible in the interior of the container. 

We also ran tests in which 100 circles are packed into a square container. (Rect- 
angular feasible sets fli are easily incorporated into the formulation by defining bound 
constraints on the centers Ci.) We generate starting points by arranging the centers 
in a 10 X 10 square lattice. We may then add a random perturbation to each cen- 
ter. Results are shown in Figure \37^ (For clarity, we show only the centers in this 
figure, omitting the circles.) When no perturbations are added to the starting config- 
uration, the algorithm does not move from the initial square configuration shown in 



(a) o = .1192514295 (b) o = .1188906843 (c) o = .1181440939 (d) o = .1179656050 



Fig. 3.3. Local minima obtained by Algorithm[l\for packing circles into a square, showing final 
overlap measures for each. 
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Figure 3.3(a). When random initial perturbations are applied (large enough that the 
original square grid structure is not recognizable in the initial point), many different 
local minima are obtained. Three of these are shown in Figures 3.3(b)\ 3.3(c). and 



3.3(d)\ Note that all of these have a maximum overlap less than the square configu 



ration, and that hexagonal structure is recognizable in large parts of the domain, with 
square structure and disorder in intermediate regions. 

Example 3.3 (Uniform spheres in R^). We performed a similar test to Exam- 
ple \3.^ in three dimensions. We checked to see whether Algorithm [7] converges to 
a solution like the FCC lattice in a finite minimum- overlap arrangement with 200 
spheres enclosed in a larger sphere. We chose the small spheres to have volume n and 
the containing sphere to have volume 200-\/l8, giving a density o/tt/vTS, identical to 
the FCC lattice, which is optimal in infinite .space. At the solution obtained by Algo- 
rithm [i| we counted the number of spheres that touch or intersect each sphere. This 
statistic provides an indication of the type of packing attained, since the FCC lattice 
has 12 neighbors per sphere, while the BCC lattice has only 8 neighbors per sphere. 
The histogram for the number of neighboring spheres is shown in Figure 3.4(a)\ A 



more instructive diagram is obtained by removing from consideration those spheres 
that touch the enclosing sphere. After doing so, we obtain the histogram in Figure 



3.4(b) This figure suggests strongly that the calculated solution is close to the FCC 
lattice over most of the interior region of the domain. 

Finally, we report on solutions obtained by Algorithm [T] on packings of discs in 
a circle, for which the optimal packing is known in only a few cases. In particular, 
we analyze packings with equally sized discs, where the number of discs is given by a 
hexagonal number, that is, 

/i(fc) = 3fc(fc + 1) + 1, k>l. (3.9) 

Example 3.4. Lubachevsky and Graham llOjj introduce curved hexagonal pack- 
ings, a new family of packings for configurations with a hexagonal number of discs. 



This family contains the best packings found so far for h{k) defined by [3.9), fork < 5. 
We ran Algorithm^ with the optimal densities found in JlOf for k ~ 3,4, 5. The best 
local optima we found are shown in Figure \375[ they are identical to the configurations 
found in /_? OJ . (We highlight some of the circles to emphasize the "curved" feature of 
the packing, which distinguishes it from a standard hexagonal arrangement, which has 
slightly lower density when restricted to a finite circle.) 



Number of neighbors 

(a) Distribution of neighbor counts. 



Number of neighbors 



(b) Distribution of neighbor counts, af- 
ter spheres at the periphery have been re- 
moved. 



Fig. 3.4. Neighbor counts for packing of 100 three-dimensional spheres in in a spherical container. 
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(a) 37 discs (b) 61 discs (c) 91 discs (d) 37 discs, larger 

radii 



Fig. 3.5. Optimal configurations found using A Igorithm [7] /or hexagonal numbers of unit discs 
in an enclosing circle. 



When we ran Algorithm [7] on the problem of 37 uniform discs in a larger disc, 
where the radii were too large to allow packing without overlap, the algorithm with 



^(C) = llClloo produced the same arrangement of centers as in Figure 3.5(a) (see 



Figure \3.5(d) ) when initialized at a sufficiently close initial point. It is a well known 
property of minimization of the norm \\ ■ ||oo that many elements of the argument 
vector tend to achieve the maximum value. In our application, this means that the 
maximal overlap is attained by many pairs of circles. We can obtain non- overlapping 
configurations by simply reducing the radii of all discs uniformly, by an amount equal 
to half the maximal overlap. This will yield a solution in which each pair of circles 
that formerly overlapped maximally now just touches. 

4. Ellipsoid Packing. Here we discuss a bilevel optimization procedure for 
packing ellipsoids into an ellipsoidal container in a way that minimizes the maxi- 
mum overlap of any pair of ellipsoids. It is not as obvious how to measure the overlap 
between two ellipsoids as between two spheres, since it depends on the orientation 
of the ellipsoids as well as the location of their centers. We measure the overlap by 
the sum of principal semi-axes of the largest ellipsoid that can be inscribed in the 
intersection of the two ellipsoids. This overlap measure can be calculated by solving 
a small semidefinite optimization problem, constructed according to the S-procedure 



(see Subsection 4.1). These are the lower- level problems in our bilevel optimization 



formulation. The upper-level problem is to position and orient the ellipsoids so as 



to minimize the maximum overlap (see Subsection 4.2), while keeping all ellipsoids 
inside the enclosing shape. We refer to this problem as "min-max-overlap." Dual 
information from the lower-level problems provides a measure of sensitivity of the 
overlaps to the ellipsoid parameters, allowing us to develop a successive approxima- 
tion approach, with trust regions, whose accumulation points are stationary for the 
min-max-ovcrlap problem. Technical results regarding the trust-region approach and 



the proof of convergence are given in Subsection 4.3 



4.1. Measuring Overlap. Boyd and Vandenberghe [31 Section 8.4.2] consider 
the problem of finding the ellipsoid of largest volume inscribed in an intersection of 
ellipsoids. The volume of an ellipsoid £ — {c + Su \ \\u\\2 < 1} is proportional to 
det(S'). Although this problem is convex, it is not a semidefinite program (SDP), 
because the objective is nonlinear. We thus consider an alternative in which trace(5') 
is used as the objective. The trace is the sum of lengths of the semi-axes of the 
ellipsoid, which is a good proxy for the volume in problems of the type we consider. 
Trace maximization admits an SDP formulation of the lower-level problems, which 
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facilitates theoretical development and analysis of our min-max-overlap problem. 
Recalling from (2.1) that we define the ellipsoid £i by 



£^■■^{CH + S,u\\\u\\2<l}, (4.1) 

we introduce the notation — Sf . Parametrizing the inscribed ellipsoid similarly by 
£ij := {cij + SijU I ||m||2 < 1}, and using (2.3 1 to formulate the fact that the inscribed 
ellipsoid is contained in both £j and £j , we formulate the problem of measuring overlap 
as follows: 

0(ci, Cj , Si, Sj) := max trace(S'ij) 

Sij^O.Cij ,Xiji,Xij2 

subject to I Ajji - 1 {cij - CiY I ^ 0, (4.2a) 



—Kjil 







Sij 





Xtji - 


- 1 


{Cij C 


Sij 


Cij — 


Ci 




— Kj2l 







Sij 







- 1 


{Cij — c 


Sij 


Cij — 







The Lagrangian can be written as 

£(c, Sij, Xiji, Xij2, Tij, Miji, Mij2) '■—{!, Sij) + {Tij, Sij) 

'\jil Sij 

(M„-i, ( Ayi-1 (c„ -c,)^|) 

'^ij2^ Sij 

(My2, ( Ay2-1 (C„ -Cj)^|), 

Si/i Cii C-j Xj-y 



-•3 



with the dual problem being derived from 

»f wn»r wn-r Ic wn i ^C,j , Sij , Xijl, Xij2,T,j , M^ji, Mij2) 

Introducing the following notation for Miji and Mij2'. 

(Riji r^ji Pi]i\ /Rij2 rij2 Pij2\ 

rfji Pi]i qTji , M,j2 ■■= rf^2 Pij2 qfj2 , (4-3) 
Pijl Qijl QijlJ 9ij2 Qij2 / 

we can write the dual explicitly as follows: 

6(ci, Cj , S„ Y.j) := min p^ji + p.^a + '^qJ^iCi + 2(7,^2Ci 

+ {Qijl, Si) + {Qij2, Sj) 



subject to = / + T,j - 2Pyi - 2Py2 (4.4) 
= trace(i?iji) - p^^i 
= trace(i?ij2) - Pij2 
= (7iji + qij2- 
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(We have assumed without loss of generality that Piji and Pij2 are in S'R"^"; this 
follows from 5*,^ e SR"""".) 

When the ellipsoids £i and £j overlap, strong duality holds for this primal-dual 
pair of semidefinite programs since, as we now verify, both problems have a strictly 
feasible point. For (4.2) we know that there exists an ellipsoid with positive volume 



that is strictly inscribed in the intersection. By setting Cij and Sij to be the parameters 
of this ellipsoid (with Sij >- 0), the S-proccdure for strict inequalities can be applied 
to show that (strict) definiteness holds in (4.2a) and (4.2b). This fact establishes 



strict feasibility of (4.2). For the dual (4.4), we can set Tij ~ I and define 



ijl 



M,j2 = 




It is easy to verify that these choices satisfy the constraints in (4.4), along with the 
(strict) inferiority conditions M^i >- 0, Mij2 >- 0, Tij >~ 0. This observation of 
strong duality justifies our use of the notation 0{ci, Cj, S^, Sj) to denote the optimal 
objectives of both primal and dual. 

4.2. Choosing Ellipse Positions and Orientations. The main variables in 
the min-max-overlap problem are the parameters defining the ellipses £i for i = 
1,2, ...,Ai": the centers Ci and the orientations defined by Si (and thus = 5*^^). 
For n = 3 (which we assume in this section and subsequently), we would like to fix 
the lengths of the axes of each ellipsoid at the values r^i, ri2, and (assuming that 



fii > ri2 > Tis). This is equivalent to fixing the eigenvalues of at r^j^. 



and 



i3' 



or to fixing the eigenvalues of Si to r. 



1^12, 



and 



Using the notation O defined in (4.2) and (4.4 1, we can formulate the min-max 



overlap problem as the following bilevel optimization problem: 



min £ 

i,{ci,Si,Si),i=l,2,...,N 



(4.5a) 



subject to ^ > 0(q, Cj, E,, Ej), 1 <i < j < N, (4.5b) 

Si as, i = l,2,...,N, (4.5c) 

E, = Sf, (4.5d) 

semi-axes of £i have lengths r^i, ri2, ri^, i = 1, 2, . . . , iV. (4.5e) 

This problem is nonconvex for three reasons. First, each pairwise overlap objective 
0(ci, Cj, Si, Ej) is a nonconvex function of its arguments. This issue is intrinsic; 
as in the sphere-packing problem, we expect there to be many local solutions and 
we can only expect our algorithm to find one of them. As we see below (in (4.101 



and Algorithm 4.4), our algorithm iteratively solves subproblems in which each O is 



replaced by a linearized approximation that makes use of the optimal dual variables 
Miji and Mij2 from the formulation (4.4). These subproblems will be convex if we 



can overcome the other two sources of nonconvexity in the formulation (4.5), which 
we address now. 



The second nonconvexity issue is in the constraint (4.5e) on the eigenvalues of Si, 
i = 1,2, . . . , N. We consider instead the following convex relaxation: 

S^-nlIdlO, Si-nsIhO, tr&ce{S,) = r.a + r,2 + n3. (4.6) 

Note that this is indeed a relaxation: If £i has the desired dimensions, then the 



eigenvalues of Si are rn , ri2 , and r^a , and the conditions ( 4.6 ) are satisfied. Because the 
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overall goal is to minimize maximal overlap, and because minimum- volume ellipsoids 



are those that are most eccentric, the relaxation (4.6) is usually "tight" with respect 



to (4.5) in many interesting cases. Intermediate iterates are often observed to have 
ellipsoids less eccentric than desired. 

The third source of nonconvexity — the constraint (4.5d) — is relatively easy to 



deal with. We replace it with the following pair of restrictions: 

^0, S,hO, i = l,2,..., 



Si Si 
S, I 



N. 



(4.7) 



The first of these conditions ensures only that ^ Sf. However, the overlap 
0{ci, Cj, Si, Ej) will grow if is replaced by any matrix E^ ^ E.^. Hence, because of 



the min-max-overlap objective in (4.5), the matrices E^ will be set to the "smallest 
possible values" that satisfy the conditions (4.7), that is, E^ = Sf. 



Finally, defining the containing ellipse to be £ := {x \ {x — c)^E ^{x — c) < 1}, 



we can use (2.3) to formulate the condition (4.5c) as follows 



-X^I 5, 
A, - 1 (c, - c)^ 
Sj Cj c S 



^ 0, 



(4.8) 



Given all these considerations, we define the relaxed version of (4.5) to be ad- 
dressed in this section as follows: 



min f 

^,(\i,Ci,SiSi),i=l,2,...,N 



(4.9a) 



^ > 0(c, 


, Ei , 








l<i<j<N, 


(4.9b) 


"-A,/ 




















A,-l 


{Ci 


-cf 




z = l,2,. 


.,N, 


(4.9c) 


. 


Ci - c 




~E 










Ej Si 
5, / 










i = 1,2,. 


.,N, 


(4.9d) 


5. - ^ 0, 


S^- 


nsl h 0, 


z = l,2,. 


.,N, 


(4.9e) 


trace (5'i) 


= ni ^ 


f ri2 


+ ri3, 




i = 1,2,. 


.,N. 


(4.9f) 



Note that when the ellipse £i is actually a circle, that is, vn = ri2 — r^s, we can fix 
Si = rul and E^ = r^^I in (4.10), and eliminate these variables from that problem. 
Hence, we can assume without loss of generality that rn > ri^. 

In the remainder of this subsection, we describe our algorithm for solving the 
bilevel optimization problem (4.9), and prove convergence properties. Our develop- 
ment and analysis takes place in a general setting that encompasses (4.9) but uses 
simpler notation. A key step in the algorithm is the solution of a subproblem for (4.9 ) 
in which the objective is linearized using the optimal values from the dual overlap 
formulation (4.4). The other constraints in (4.9) remain the same, and a trust region 
is added to restrict the amount by which the ellipsoid parameters can change. This 
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subproblem can be stated as follows: 



min £ 

f,(A,,Ci,S.,Si),i=l,2,...,Ar 



subject to ^ > piji + pij2 + "IqljiCi + 2qJj2Cj 

+ {Qljli ^i) + {Ql]2, Sj), 



(4.10a) 



for(i,j)el, (4.10b) 





















A, - 1 (q - c)^ 




i = 


1,2,. 


.,iV, 


(4.10c) 




Cj - c -E 












Si 
I 






i — 


1,2,. 


.,N, 


(4.10d) 


- r,J ^0, nsl h 0, 


i = 


1,2,. 


.,iV, 


(4.10e) 


trace(5i) 


= ni + + r^s, 




i = 


1,2,. 


.,iV, 


(4.10f) 


\\Cr-C~\ 


2 A 2 
2 — ^c: 




i = 


1,2,. 


.,7V, 


(4.10g) 


\\S^-Sr 


II < As, 




i = 


1,2,. 


.,N, 


(4.IOI1) 


|A^-A-| 


< Aa, 




i — 


1,2,. 


.,7V. 


(4.10i) 



Here, (A~, c~, 5'~, E~) are the values of the variables at the current iteration, and Ac, 
A5, and A\ are trust-region radii. The quantities Piji,Pij2, liji, <lij2, Qiji, and Qij2 
are extracted from the dual solutions Miji and Mij2 of (4.4) according to the structure 



(4.3). The set X represents a subset of all possible pairs (i,j) for 1 < i < j < TV, 
representing some selection of ellipses which currently have a (strict) overlap. Further 
details on the choice of I are given in Subsection |4.4[ 

We claim first that, if it is possible to fit each ellipsoid St strictly inside the 



enclosing ellipsoid £, the subproblem (4.10) satisfies a Slater condition. That is, there 



exists a point that satisfies the linear equality constraints and strictly satisfies the 
inequality constraints in this problem. To justify this claim, we first show that it is 
possible to find a point (A^, c,;, Si,Yii) that satisfies the following conditions: 



"-A,/ 

A, - 1 (c, - c)^ 

Si Ci c E 


^0, 


i = l,2,. 


.,7V, 


(4.11a) 




Ei Si 
St I 




i-1,2,. 


.,7V, 


(4.11b) 


St - Till -< 0, 


Si - r^I >- 0, 


i = l,2,. 


.,7V, 


(4.11c) 


trace(5i) — rn + 


ri2 + 




i = l,2,. 


.,7V. 


(4.11d) 



First, choosing q = c in (4.11a), and orienting ellipsoid £i appropriately, we can find 
Ai > such that (4.11a) is satisfied. This remains true if we perturb Si slightly 
so that its spectrum lies in the open interval (ris^rn) while still satisfying the trace 
condition (4. lid). This perturbed Si thus satisfies the conditions (4.11c ). Second, we 



can simply define E^ — ail for large enough tTi > to ensure that (4.11b) is satisfied 
strictly. 

Next, note that from the current iteration, we have a point (A~, c~, S^ , E^) that 
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satisfies the constraints of (4.9 1, that is, 





s- 



s- 

{c7-cr 


^0, 


i = l,2,. 


.,7V, 


(4.12a) 


"s- Si- 
Si I 




i = l,2,. 


.,7V, 


(4.12b) 


Si - ral h 0, 


i = 1,2, . 


.,7V, 


(4.12c) 




1 


i = l,2,. 


.,7V. 


(4.12d) 



Si - r,a/ ^ 
trace(5j^) = 

It is now easy to check that for sufficiently smaU e > 0, the point 

{\^{<^),c^{e),S^{t),Y.i{e)) := (1 - e)(A,^, c,^, S*,^, E^) + e(A,, c,, 5i, S^) 



satisfies the inequahties ( 4.10c[ ), ( 4.10d[ ), and (4.10e) strictly, satisfies the linear con- 
straint (4.10f), and satisfies the trust-region constraints (4.10g), (4.10h), and (4.10i| 



strictly. Since we can choose ^ arbitrarily large to strictly satisfy (4.10b), we conclude 



that there exists a Slater point for (4.10) 



4.3. Technical Results. We prove here some technical results that provide the 
basis for convergence of the trust-region strategy. To simplify the notation, we note 



that each dual overlap problem (4.4) has the general form 

s.t. {Ai^„Mi) = hi^„ I = 1,2,..., pi, MihO. 



t\{C) :=min {C,Mi) 



(4.13a) 
(4.13b) 



Here C captures the parameters that describe all the ellipses, and Mi is the dual 
variable for the overlap problem. We assume that C lies in a set Vl of the following 
form: 



^lf^{C : {Bk,C)^bk, fc = l,2,...,p}, (4.14) 
where 17 is closed, convex, bounded, with nonempty interior. We now verify formally 



that the ellipse parameters satisfying the constraints in (4.9) can be expressed in the 



form (|4.14|) . We define C to be a block-diagonal matrix with 7V blocks of the form: 

(4.15) 



1 



1,2, 



,7V, 



and define f2 to be the set of all symmetric matrices of this form for which there 



exist Xi and Si such that each tuple {\i,Ci, Si,Yii) satisfies the conditions (4.9c) 



(4.9d), and (4.9e). Boundedness of Ci is obvious from the containment condition 



Ei C E; boundedness of Si follows from (4.9e|; while (4.9c) implies that e [0,1]. 
Boundedness of E^ is not guaranteed by the constraints in (4.9). We could, however, 
add the constraint E^ < r^^I without changing the solution of the problem, thus 
completing the verification of boundedness of Cl. (For simplicity, however, we do not 
put this explicit bound on E^ in our discussion below.) Closedness and convexity are 



immediate consequence of the form of the constraints (4.9c), (4.9d), and ( 4.9e ). To 
verify nonemptiness of the interior of Cl, recall the discussion following (4.9), where 
we noted that variable Si can be eliminated from the formulation if ellipsoid Ei is in 
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fact a circle. Thus, we can assume without loss of generality that rn > r^a for all 



and hence, from the discussion surrounding (4.11 ), we conclude that the set of tuples 
{Xi,Ci, Si,Yji) allowed by constraints (4.9c), (4.9d), and (4.9e| has nonempty interior. 



The structural features of C (the diagonal element 1 in (4.151 and the off-diagonal 
zeros) can in principle be enforced by affine constraints of the form given in (4.14). 



The constraints (4.9f) can also be enforced in this way. 



Following the notation of (4.11 ), we denote by C the point that satisfies 



Ceintf7 and (B^, C) = 6^, fc 1, 2, , 



(4.16) 



We denote by Mi{C) an optimal value of Mi in ( 4.13|) ( not necessarily unique). 
The primal form (4.2) of the overlap problem (4.13) has the form 



max 

Ci = (Ci,l iC!,2^-- 



,C!,p, ) 



Cl S.t. C-J2Cl,^^U^^■ 



(4.17) 



As discussed in Subsection 4.1 both (4.131 and (4.17) have strictly feasible points 



when there is positive overlap between two ellipsoids. Therefore, strong duality holds. 



so the following optimality conditions relating the solutions of (4.131 and (4.171 are 
satisfied: 



1=1 



(4.18a) 
(4.18b) 



By convention, we set i;*(C) = — oo if (4.13) is infeasible, that is, if there is no 
overlap between the two ellipses corresponding to index I. By the nature of the 
problem, we know that ti (C) > if these two ellipses have positive overlap. It is 
easy to see that t;*(C) is a concave, extended- valued function of C & fl, and as a 
consequence that t^(C) is continuous on the relative interior of its domain. Further 
useful facts about i/*(C) are given in Lemma A. 3 These include Lipschitz continuity 
in a neighborhood of a point C at which (4.17) has a strictly interior point (which, 
as noted in Subsection |4.1[ occurs when the two ellipsoids have positive overlap), and 
the fact that any solution Mi{C) of (4.13) belongs to the Clarke subdifferential of 

Using the notation of (4.13) and (4.17) to capture the min-max-overlap problem 



(4.5), we can state this problem as follows: 



min t*(C) := max tUC). 

Ceo i=l,2,...,m 



(4.19) 



Here each element in {1,2,..., m} represents the overlap problem for a given pair of 
ellipsoids. Note that t*{C) — — oo if no pair of ellipsoids overlaps or touches. 

We now define the subproblems to be solved at each iteration of the algorithm, 
which depend on just a subset J- C {1, 2, . . . , m} of the individual overlap problems. 
The key quantity is defined as follows 



max tUC), 



(4.20) 



where is a subset of the strictly overlapping ellipsoid pairs, that is. 



J"C {/ = 1,2, .. .,m : t*i{C) > 0}. 
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(We will be more specific about the definition of T later.) In the algorithm, the 



solutions Mi(C) of (4.131 for I £ T are used to construct a linearized subproblem 
whose solution is a step AC in the parameter C, assuming that the current C is 
feasible. The subproblem is as follows: 

L{T,G,M^{C),p): r{F,C,MAC),p) -.^ lam r (4.21a) 

r, AC 

s.t. r > i;(C) + (AC,A//,(C)), l£T, (4.21b) 
C + ACef7, ||AC||<p. (4.21c) 

Here p > is a trust-region radius, and Mjr{C) denotes the set of matrices {Mi{C) : 



I G J^}. The problem (4.211 is convex, and its feasible set is bounded, so it has an 
optimal value which we denote by AC(p). Further, the KKT conditions are satisfied 
at this point. This claim follows from the fact that, given the point C satisfying 



(4.16), and defining AC = e(C — C) for some small positive e > 0, we have that 

C + AC = (1 - e)C + eC e int n, 
while the trust-region condition is strictly satisfied (||e(C — C)|| < p), and the remain- 



ing constraints in (4.21) are affine. Hence, the conditions of [131 Theorem 28.2] are 
satisfied, and we can apply il3j Corollary 28.3.1] to deduce that there exist jii,! £ F 
and r > such that 

l-^A'i=0, (4.22a) 

< Ai/ ± r{F,C,M^{C),p) - t*i{C) - (AC,M,(C)) > 0, l^F, (4.22b) 
-^piMi{C) ~Tue NniC + AC) for some u e a||AC||, (4.22c) 

C + ACen, (4.22d) 
< r _L p- ||AC|1 > 0. (4.22e) 

Here Nfi{C) denotes the normal cone to the closed convex set ft at the point C (see 
(1.1)) and d denotes a subdifferential. (As noted in Section [l] since || • || is convex 



and Lipschitz continuous, the Clarke subdifferential coincides with the subdifferential 
from convex analysis.) Note that the set {tv : t > 0, v E 9||AC||} is a closed convex 
cone and that it is an outer semicontinuous set-valued function of AC. 

It is sometimes convenient to rewrite L{J^, C, Mjr(C), p) by defining the function 



G^(AC; C, M^(C)) ■.= um^{C + AC, Mi (C)) , (4.23) 



and writing 



L{T,C\M^{C),p) : min C^(AC;C,M^(C)) s.t. C + AC G 17, ||AC|| < p. 

(4.24) 

Note that Gjr{-\ C, Mjr{C)) is convex, in fact piecewise linear. 

Next, we define the reference problem P{J-) that minimizes f^iC) defined in 



(4.20) over C e Q: 



P(T): := min tV(C) = min max t,*(C). (4.25) 

cen cen ler 
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Nonsmooth analysis provides the following result that characterizes solutions of (4.25 ). 



Lemma 4.1. Suppose that for a given set J- d {1,2,.. 



m}, C is a local solution 
of (4--25) at which (4-11) has a strictly interior point, for all I G J- . Define f{C) 
to be the set of indices achieving the maximum in {4-25), that is, J'{C) — {I ^ J- : 
tii^) — tj^}- Then there exist Mi £ dt*i(C) and scalars fii, for all I G J-{C), such 
that 



l^iMieNniC), 



J2 = 



fii>0, le T{C), C en. (4.26) 



Proof. We appeal to results about the Clarke subdifferential applied to max- 
functions and sums of functions. First, note that the strict inferiority assumption 
means that we can apply Lemma A. 3 (iv) to deduce that each i^* is Lipschitz continuous 
in a neighborhood of C. Hence, applying Theorem 6.1.5], we have that 

dt*^{C) C conv{dt;{C) : I e ^(C*)}, (4.27) 

where conv(-) denotes the convex hull. The Corollary in ^ p. 52] can be used to show 
that when is a solution of (4.251, we have 

Oedt*^{C) + Nn{C). 



The result follows by combining this expression with (4.27). □ 

By introducing multipliers for the indices I E J'\ J-{C), we can restate the con- 
ditions (4.26) as follows: 



0< t^i ±t*^~tuc)>o, 

51 Mi = 1, 



for all I G 



(4.28a) 
(4.28b) 



Cen. 



(4.28c) 
(4.28d) 

We say that a point C at which these conditions are satisfied is Clarke- stationary for 
P{T) defined in (|L25]). 

For purposes of our main technical lemma, we define the "predicted decrease" 
from subproblem L{F ,C ,Mjr{C), p) as follows: 



K{F, C, M^{C),p) :- t*^{C) - r{T, C, M^C), p). 



(4.29) 



Note that since AC = is feasible for ( |4.21[ ), we have A( J", C, Mjr{C),p) > 0. 
Lemma 4.2. Let J" C {1, 2, ... , mj be given. 

(i) Suppose that C is such that (^.i?) has a strictly feasible point for all I G T . 
If K{T , C, Mjr{C), p) — for some p > and some set of solutions Mi{C) to 
{4.13) for I G T , then C is Clarke- stationary for P{T). 
(ii) K{!F ,C, Mjr{C), p) is an increasing function of p> 0. 
(Hi) K{!F ,C,Mjr{C), p) / p is a decreasing function of p> 0. 

(iv) For all C in some neighorhood of C defined in (i), we have that t*jr{C + 
AC(p)) < r{T,C,Mr{C),p) for any p>Q. 
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Proof, (i ) If r {T,C,Mjr{C),p) = t*jr for some p > 0, then AC = achieves the 
minimum in (4.21), for C = C. Hence, there exist fj,i, I (z J- such that the optimality 
conditions ( |4.22[ ) are satisfied with AC = and t — 0. Thus, conditions (4.28) are 
satisfied with Mi = Mi{C) and the same vahies of /x;, I E 

(ii) Trivial, as the size of the feasible region for L{T, C, Mjr^C), p) increases with p. 

(iii) We need to show that for pi and p2 with < pi < p2, we have 

A(J-,C,M^(C),pi) ^ k{T,C,MAC),p2) 



Pi 



P2 



Using the formulation (4.24) of L(T, C, Mjr^C), p), and in particular the convex func- 
tion Gjr{-; C,Mjr{C)) defined in ( |4.23[ ), we have that 

C^(0;C,Af^(C)) = um^{C,Mi{C)) = i^(C). 

Given the solution AC(p2) of L[F, C, M^(C), P2), note that ^AC(p2) is feasible for 
L{T,C,Mjr{C),pi). Since AC(pi) is optimal for C, m/(C), pi), we have 

' Pi 



G^{ACipi);C,M^iC)) < C^ ^AC(p2); C, M^(C) 

P2 



< (1 - ^ ) C^(0; C, MAC)) + ^C^(AC(p2); C, M^(C)). 



P2/ 



P2 



The result follows by rearrangement of this expression, since 

A(^, C, M^(C), pi) = C^(0; C, M^(C)) - G^(AC(pi); C, Af^(C)), 
A(^, C, Af^(C), P2) = G^(0; C, M^(C)) - G^(AC(p2); C, Af^(C)). 



(iv) The result follows immediately from Lemma A. 3 (iv), when we use the definition 



(4.20) and the fact that 



riT,C\MAC),p) 



max tt (C) 



{AC{p),Mi{C)). 



We show now that all accumulation points of a sequence {Cfc} for which 

A(^,Cfc,Ai-^(Cfe),l)^0 

are Clarke-stationary for P{J-)- 

Theorem 4.3. Suppose that for a given set J- C {1, 2 , . . . ,m}, {Cfc} is a sequence 
of matrices in fl converging to a limit C such that (4-11) has a strictly feasible point 
for each I G J- . Suppose further that K{J- ,Ck, Mjr[Ck), 1) — > 0. Then C is Clarke- 
stationary for P{J-)- 

Proof. We invoke Theorem |A.2| to deduce that for all / G the solution sets of 
P{l,Ck) in (4.13) are uniformly bounded for all k sufficiently large. Hence, we can 



identify subs eque nces of {Mi(Ck)} for each I G T that approach some limit Mi, where 
(ii). Ml is a solution of P{l,C) for each I G J^. We can thus write 



A.2 



by Theorem 

Mi{C) = Ml for each I G J-. By defining Mjr[Ck) and Mjr(C) in obvious ways, and 
taking a subsequence, we have that Mjr[Ck) Mjr[C). 
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We show next, by contradiction, that A(J^, C, Mjr(C), 1) = 0. If this claim is not 
true, there exists AC" such that 



IIAC'II < 1, C + AC"el7, G^(AC";C,M^(C)) < t>(C) -e, 
for some e > 0. Defining 



AC^ :=C-Cfe + AC', 



C, IIACfcll < 1, and Cfe + AC', - C- 



we have from Ck 

feasible for L{T,Ck, Mjr{Ck),2). Hence, invoking Lemma 4.2 (iii), we have 



AC" e n that AC^ is 



lim G^(AC^; Cfe, M^(C,)) > lim t^C'k) ~ A{T, Cfe, M^(Cfc), 2) 

A; A; 

> limt^(Cfc) - 2A(J-, Cfc, Af^(Cfc), 1) 

k 



The final limit above follows from the definition of i^. Lemma A. 3 (iv), and the 
assumption that A(J^, Ck, Mjr{Ck), 1) — » 0. On the other hand, we have from Ck + 
AC; = C + AC, the definition of Gjr (|4.23[), and the limit Mjr{Ck) Mjr{C) that 



lim G^(AC^; A/^(Cfc)) - lim max {Ck + AC^MiiCk)) 

k k l^J- 

= lim m^ (C + AC, Mi{Ck)) 
^mSixlC + AC',Mi(C)) 

where e > is defined above. This yields the contradiction, so we conclude that 
K{J^ ,C , Mjr{C ,1) = 0, as claimed. Clarke stationarity of C for P{F) now follows 
from Lemma [4. 2| (i). □ 

4.4. Algorithm. We now define the algorithm for solving the problem P defined 
by (4.19). Note that in this general setting, t;*(C) defined by (4.13) is continuous on 



the set 



:={C : t*i{C)>-^}, 



which is closed and convex. We make additional assumptions about the nature of 



the solutions to the parametrized primal-dual pair (4.13), (4.171, that do not hold in 



general, but which are satisfied for the application we consider here. 
Assumption 2. 

(i) tr(c)>-oo ^ n(c)>Q. 



(a) Ift^{C) > 0, then the dual {4-1'^) has a strict feasible point. 

It is an immediate consequence of Assumption [2] and Lemma fA.SI that all points C 
on the boundary of ^E"; have (C) ~ 0. We also have the following uniform continuity 
result. 

Lemma 4.4. Suppose that t;*(C) is defined by (4--13), that Assumption^holds, 
and that ^l has the form (4-. 14-)- Let t > be given. Then for any e > 0, there is 
S > such that for all C £ fl, all C £ with ||C — C|| < 6, and all I = 1,2, ... ,m, 
the following conditions hold: 

(i) Ift*{C) > I then t*i{C) > t;{C) - e; 
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(it) t*i (C) < max(0, t*i (C)) + e. 

Proof. Note first that since Vt is compact, the set ^i{t) := {C € | t*i{C) > t} 
is also compact, for any I £ {1, 2, . . . , m} and any t > 0. Since t^(-) is continuous at 
every point of this set, under the stated assumptions, it is uniformly continuous on 
this set. Thus for any e > 0, there is a value S = Si{e) > such that (i) holds. Thus 
it is sufhcient for (i) to define S to be min/=i^2....,m '5;(e). 

For (ii), we suppose for contradiction that for some e > 0, there is no 5 > with 
the property claimed. Thus, for any sequence {Sr} with dr i 0, we can find Cr G fi, 
Cr E ^ with \\Cr ~ Cr\\ < Sr, and I E {1,2, ... , m} such that 



t^Cr) > max(0,t,*(a)) 



(4.30) 



By taking a subsequence if necessary, we can assume that this inequality holds for some 
fixed I E {1,2,..., to}. Since all Cr belong to the compact set f2 H {C \ t1{C) > e}, 
we can assume (by taking another subsequence if necessary) that Cr — C, for some 
C with t;*(C') > e. It follows that Cr G also, so using continuity of t\ and taking 



limits in both sides of (4.30), we obtain 



t\{C)= lim t\[Cr)> lim max(0,t;(a)) + e = niax(0,i;(C')) + e > ^^(C') + e, 



a contradiction. □ 

A key issue in implementing the algorithm is to decide which subset T of the 
overlapping pairs to use in calculating the step AC in (4.21). Clearly, T should 



include the indices I for which the overlaps between the corresponding ellipsoid pairs 
are at or near the maximum. It could also include other indices with positive (but 
smaller) overlap. Clearly, it cannot contain any non-overlapping ellipsoids, as the 
problem P{l,C) (4.13) has no solution in this case, so Mi{C) is not defined. We 
settle on the following requirement, which depends on parameters rji, r/2 E (0, 1) with 



< 77i < 7^2 < 1: Given Ck for which t*{Ck) > (see definition ( |4.19 )), we choose 
to satisfy: 



{I ■■ t*i{Ck) > V2t*{Ck)} C C {I : t^iCk) > mt*{Ck)}- 



(4.31) 



Algorithm |4.4| describes our method. It follows a standard trust-region framework, 
though its analysis is a little non-standard. At each iteration, we calculate a candidate 
step ACfc by solving the linearized subproblem ( 4.21[ ) with trust-region radius pk, 
and calculate the predicted reduction A{J^i;,Ck, Mjr^{Ck), Pk) (4.29) expected from 



this step. If the actual objective achieves at least a fraction ci of this decrease (for 
ci E (0, 1)), we accept the step. If in fact the improvement is at least a larger fraction 
C2 of the expected decrease, we may increase the trust-region radius for the next 
iteration. Otherwise, we do not take the step, but rather shrink the trust-region 
radius and proceed to the next iteration. 

We now show that when the values t*{Ck) are bounded away from zero, there is 
a positive threshold such that any step ACfc with norm smaller than this threshold 
will be accepted. 

Lemma 4.5. Suppose that Assumption^holds and let i > be given. Let Ck he 
any iterate with t*{Ck) > t such that Ck is not Clarke- stationary for the problem P 
defined in 1^4.19 ), and suppose that J^k satisfies (4.31). Then there exists a threshold 
value Pi > (independent of Ck) such that 



t*{Ck + ^C{p)) < t*{Ck)-HTk,Ck,M^^{Ck),p) < t*{Ck) (4.32) 
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Algorithm 2 Packing Ellipsoids by Minimizing Overlap 

Given fl C S'R"^" compact; G (0, 1); Ci and C2 with < ci < C2 < 1; 4>i and 02 
with < 01 < 1 < (/>2; and pmax > 0; 
Choose Co e fi, po € (0, Pmax]; 
for fc = 0,1,2,... do 
Define J^k as in (4.31); 



Solve L(J-fc,Cfe,M^,(Cfe),pfc) (4.21) to obtain AC^; 



Compute predicted decrease A{J^k,Ck, Mjr^{Ck), Pk) from (4.29); 
if ACk = or t*{Ck + ACk) = then 

stop; 
end if 

if r{Ck + ACk) <t*{Ck) - c^M,Fk,Ck,M^^{Ck),Pk) then 
Cfe+i Cfc + ACfc; 

if r(Cfc + ACk) < t*{Ck) ~ C2K{Tk.Ck.Mr^{Ck).Pk) then 

Pk+l ^ niin((?!)2Pfe,/9inax); 

end if 

else 

Ck+i Ck] 

Pk+i ^ 4'iPk] 

end if 
end for 



whenever ||AC(p)|| is a solution of L{Tk,Ck, Mjr^{Ck), p) with p G (0,pt]. 

Proof. Note first that if Ck were Clarke-stationary for P{Tk), given that J^k 
contains all the indices I for which ti (Ck) attains the maximum t* (Ck), we would have 
that Ck is also Clarke-stationary for P, which we have assumed is not the case. From 
Assumption [2] and Lemma [4. 2| (i) we have therefore that A{J^k,Ck, Mjr^{Ck), p) > 
for aU p > and all solutions Mi{Ck) to ( |4.13[ ) with C = Ck and I £ Fk- 

Now define e — t[l — ?72)/2, and let pt be the corresponding (positive) value of 
5 from Lemma 4.4 Consider any AC such that ||AC|| < pf. For indices I such that 
t1{Ck) — t*{Ck) > t, we have from Lemma 



4.4 



(i) that 



tUCk + AC) > t*{Ck) - til - m)/2 > r (Cfe)(l + r,2)/2 
For indices I ^ Tk, we have tliCk) < Ti2't*{Ck), and so from Lemma 



4.4 



it follows that 



t*i{Ck+AC) < max(0,t;(Cfc))+i(l-772)/2 < mt*{Ck)+t{l^V2)/2 < t* (Cfc)(l+r,2)/2. 

Hence, for ||AC|| < pf, the index Zfor which ^(Cfe -|- AC) — ti{Ck + AC) comes from 
the set J-k, that is, 

t*^^{Ck + AC)^t*{Ck + AC). 

So choosing p e (0, pt] and setting AC(p) to be the solution of L{J^k,Ck, Mj^^ (Ck), p), 
we have from Lemma 4.2 (iv) that 

r(Cfc + AC(p)) = t>JC, + AC(p)) 

<t*^JCk)-A{Tk,Ck,M^,iCk),p) 

<tF,{Ck) 

= t*{Ck), 
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as claimed. □ 

The inequality (4.32) satisfies the step acceptance conditions in Algorithm 4.4 
since < ci < C2 < 1. It follows immediately that for any Ck with t*{Ck) > 0, the 
algorithm cannot "get stuck" by performing infinitely many unsuccessful iterations — 
eventually it will decrease p to the point where the step acceptance condition holds. 

We now prove the main convergence result. 

Theorem 4.6. Suppose that Assumption^ holds. Then either Algorithm \4-4\ 
terminates finitely, or else it generates an infinite sequence of iterates {C'k} for which 
accumulation points exist, and all accumulation points are Clarke- stationary points of 
P. When t*{Ck) i 0, all accumulation points are in fact global solutions of P. 

Proof. The finite termination cases are obvious, so we focus on the case of an infi- 
nite sequence {Ck}- Since all iterates are confined to the compact set f2, accumulation 
points of sequence {C^} exist. Note that the sequence of function values {t*{Ck)} is 
decreasing. The final statement of the theorem is self-evident, as this case indicates 
convergence to points at which there are no overlaps between ellipsoids. Hence, we 
focus on the case in which there exists f > such that t*{Ck) > i for all k. From 
Lemma [4?5l we see that at each iteration fc, the trust-region radius pk will generate a 
successful step AC^ whenever it falls below pf. Hence, since the algorithm decreases 
p by a factor of after each unsuccessful step, we have that 

Pk > min(po,</'iPt), for aU k. (4.33) 

In considering accumulation points of the sequence {Ck} we can remove all re- 
peated entries from the sequence. These repeats arise from unsuccessful steps (for 
which the acceptance condition was not satisfied), and the accumulation points of the 
sequence are the same whether the repeated entries are present or not. Note that 
there must be infinitely many successful steps since, as we note in the comment after 
Lemma |4.5[ the algorithm must eventually move away from any non-stationary point 
Ck with t*{Ck) > 0. We denote the subsequence of successful iterates by S. 

At a successful iteration fc e 5, we have 

t*{Ck+i)^t*{Ck + ACk) 

< t*{Ck) - ciA{Tk, Ck,M^,{Ck),Pk) 

< t*{Ck) - ci min(pfe, l)A{J'k,Ck,Mr,{Ck), 1) 

< t*{Ck) - ci min(po, 0iPt, l)A(J-fc, Ck,Mjr^ (Ck), 1), 

where we used Lemma |4.2| (ii) and (iii) to derive the second inequality, and the last 



inequality comes from (4.33). Since the sequence {t*{Ck)} is decreasing and bounded 
below (by t), we have < t*{Ck) — t*{Ck+i) i 0, so by rearranging and using the fact 
that min(po, 0iPt, 1) > 0, we have 

lim AiTk,Ck,M^,{Ck),l) = 0. (4.34) 

Now suppose that (7 £ is an accumulation point of the full sequence {Ck}. As 
noted above, it must also be an accumulation point of the "successful iterate" sequence 
{Ck}kes- So by taking a further subsequence S' C 5, we have lim^g^/ Ck ~ C. Since 
there is only a finite number of possibilities for the set Tk, we can take another 
subsequence S" C S' such that, in addition, Tk = jF for all k € S" . By the definition 



( |4.31[ ), we have that t*i{Ck) > mt*iCk) > Vit for all / e Tk and all k e 5". Thus, 
using continuity of t^ , we have that 

t^{C) >riit> 0, foralUeJ", 
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implying that P{1, C ) has a strictly feasible point for all I e Clarke stationarity of 
C now follows from (4.34 1, using the fact that S" C S and applying Theorem 4.3 □ 



5. Application: Chromosomal Arrangement in Human Cell Nuclei. We 

return to the application introduced in Section [l] that is, finding arrangements of 
chromosome territories in a cell nucleus on the basis of simple geometric principles 
(namely, low overlap and discouragement of proximity for homologous pairs) and 
seeing how closely the resulting arrangements match the experimental observations 
that have been made to date. 

During most of the cell cycle the chromosomes of higher eukaryotes are organized 
into distinct compartments known as chromosome territories (CTs). These domains 
have a roughly ellipsoidal shape and can overlap each other. This overlap is believed 
to have an important biological purpose, since it allows for the interaction and co- 
regulation of different genes. Additionally, the CTs tend to exploit the space available 
inside the cell nucleus, to allow for internal DNA-free channels, the interchromatin 
compartments. These compartments allow CTs deep inside the cell nucleus to be 
accessible for regulatory factors. 

As noted earlier, the arrangement of CTs is known to be non-random. Arrange- 
ments are known to be broadly conserved during evolution and are similar among cell 
types with similar developmental pathways. CT arrangements can also change during 
processes such as cancer development or cell differentiation. See [5] for more details 
and an overview about what is known about CT arrangements. 

There is strong evidence that chromosomes have a preferred radial position inside 
the nucleus. These preferences appear to be different for nuclei of different shapes, 
spherical and ellipsoidal. In ellipsoidal nuclei, CT size seems to drive the radial 
preferences, with the smaller CTs tending to lie nearer to the center. In spherical 
nuclei the situation is less clear, with the more gene-dense chromosomes seeming to 
lie nearer to the center of the nucleus. 

There is also evidence for neighbor preferences, which may play an important 
role in causing co-regulated genes in different chromosomes to be closer together. 
In particular, it has been observed recently that CTs tend to favor neighborhoods of 
heterologous chromosomes. This results in the two chromosomes in a homologous pair 
tending to be well separated. (In human cells there are 22 homologous pairs, each 
consisting of one chromosome from the mother and a similar one from the father.) 

We model a CT arrangement as a packing of overlapping ellipsoids of various 
sizes inside an ellipsoidal container, which represents the cell nucleus. Minimizing 
maximum overlap mimics the fact that the CTs exploit the space available in the 
nucleus, to allow for the presence of contiguous DNA-free interchromatin channels. 
These channels extend from the nuclear pores into the interior of the nucleus, mak- 
ing even the deepest CTs accessible from outside and connecting most chromosome 
territories. 

In this section we analyze whether purely geometric considerations, enforcing the 
simple principles of minimal overlap and well-separatedness of homologous pairs, can 
explain the observed arrangements of CTs in cell nuclei of different sizes and shapes. 

5.1. Human Cell Nucleus. The human cell nucleus has a volume of between 
500 jim? and 1600 /im^, depending on the cell size and stage of differentiation. The 
shape also differs according to cell type. Human fibroblasts, for example, have flat 
ellipsoidal nuclei, whereas lymphocytes have spherical cell nuclei. In this study we 
analyze three different nucleus sizes: small (500 /im'^), medium (1000 jim?) and large 
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(1600 fj,m^). For all three sizes we consider two shapes: spherical nuclei and flat 
ellipsoidal nuclei (with axis lengths in the ratio 1:2:4). 

We estimate the volume of each CT to be proportional to the number of base pairs 
in the chromosome, with the constant of proportionality determined by the average 
chromatin packing density. The number of base pairs for human cells ranges from 
47 Mbp (chromosome 21) to 247 Mbp (chromosome 1), while the human chromatin 
packing density in living cells has been estimated to be 0.15 fj,m^ /Mbp ([I2]). By 
multiplying the total number of base pairs by this average density, we arrive at a total 
volume of about 461 ^m^ over all CTs. The individual volumes for each chromosome 
territory are given in Table |5.1[ 

Table 5.1 

Volume of each chromosome territory based on chromatin packing density of 0.15 fim^ /Mbp. 



CT 


1 


2 


3 


4 


5 6 7 


8 


volume 


37.05 


36.45 


29.85 


28.65 


27.15 25.65 23.85 


21.90 


CT 


9 


10 


11 


12 


13 14 15 


16 


volume 


21.00 


20.25 


20.10 


19.80 


17.10 15.90 15.00 


13.35 


CT 


17 


18 


19 


20 


21 22 X Y 




volume 


11.85 


11.40 


9.45 


9.30 


7.05 7.50 23.25 8.70 





5.2. Implementation. The algorithm was implemented in Matlab and CVX [7]. 
Both, the pairwise-overlap problems and the master problem at each iterate of the 
algorithm were formulated in CVX. The algorithm is terminated when one of the 
following conditions holds. 

(i) The ratio of predicted decrease to trust-region radius falls below a specified 
tolerance. Using the notation of Algorithm |4.4[ we state this condition as 



K{Fk,Ck,M^,{Ck),Pk)/Pk < Toll, 

where we set Toll = .005 in our experiments. 

(ii) The maximum overlap falls below a small fraction Tol2 of the volume of the 
enclosing ellipsoid. We used Tol2 = .0001 in our experiments. 

(iii) The algorithm runs for 100 iterations. 

Many instances of the problem, including problems of different sizes and shapes, 
with different random starting points, were executed on servers running various ver- 
sions of Linux. 

5.3. Radial Preferences. In the first set of experiments, we use Algorithm |4.4| 
to arrange the CTs so as to minimize the maximal pairwise overlap, with overlap 



measured as in Section 4.1 (We enforce no constraints on homologous pairs in this 
first set.) We set up numerous trials with the data varied as follows. 

(i) CT volumes are obtained by sampling from a normal distribution, with mean 
taken from Table [5T] and the standard deviation set to .02 of the mean. 

(ii) The relative axis lengths are varied around the intercepts found in [9 for 
mouse chromosomes, namely, 1:2.9:4.4. The second and third ratios are sam- 
pled from a Gaussian distribution with mean values 2.9 and 4.4, respectively, 
and standard deviations of .1 times the mean. (The absolute axis lengths are 
then adjusted to match the volume chosen in (i).) 
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Table 5.2 

Statistics for final objective values attained in the six scenarios, shovjing number of trials, 
means, and standard deviations, both before and after the screening step. 







Before Screening 


After Screening 


shape 


vol (/xm^) 


trials 


mean 


sd 


trials 


mean 


sd 


spherical 


500 


100 


3.0889 


0.0533 


100 


3.0889 


0.0533 


elUpsoidal 


500 


100 


3.2769 


0.0660 


100 


3.2769 


0.0660 


spherical 


1000 


200 


1.8927 


0.4409 


195 


1.8233 


0.0657 


elHpsoidal 


1000 


200 


1.9723 


0.0714 


200 


1.9723 


0.0714 


spherical 


1600 


100 


0.6342 


1.4325 


89 


0.1349 


0.0362 


ellipsoidal 


1600 


100 


0.1338 


0.0291 


100 


0.1338 


0.0291 



We analyzed the radial preferences for two different nucleus shapes — spherical and 
flat ellipsoidal with axis ratios of 1:2:4 — and for the small, medium, and large nuclei 
with sizes described above. 

For each of these six different scenarios we ran 100-200 trials, each with data 
perturbed as described above and each from a different random starting point. We 
applied a screening step to remove those trials that have a final objective value greater 
than 

/* +max(0.5,min(0.2*/*,2.0)), 
where /* is the lowest objective value obtained over all trials for this scenario. Ta- 



ble 5.2 shows statistics on the final objective value for each of the six scenarios. Only 
a few trials were removed in the screening step, mostly for the large spherical nucleus 
in which the no-overlap solution was not quite attained. After screening, the final 
objective values were similar for all trials on a given scenario. 



Recall we use Algorithm 4.4 to solve the convex relaxation ( 4.9 ) of the original for 



niulation (4.5), in which the prescribed half-axis lengths (r^i, ri2, r^s) are replaced by 



the constraints (4.6). We found that a number of the ellipsoids were "more rounded" 
at the solution than our prescription would require, but that the deformation typically 
affected only a subset of the ellipsoids and was not too severe. By taking relative dif- 
ference in the ^2 norm between the vector of actual half-axis lengths and the prescribed 
values, we found that on small nuclei, an average of 8 of the 46 CTs experienced a 
relative change of greater than 10%. For medium spherical nuclei, about 7 out of 46 
changed by more than 10% while the corresponding number for large spherical nuclei 
is 11 out of 46. The statistics for ellipsoidal nuclei are slightly smaller, about 5 for 
small and large nuclei and 3 for medium nuclei. 

We analyzed the solutions generated in the trials remaining after the screening 



step to find the distances of each ellipsoid from the center of the nucleus. Figure 5.1 
contains scatter plots that show the mean volume of each CT (on the horizontal 
axis) plotted against the distance between the center of that CT and the nuclear 
center (on the vertical axis), for a medium-sized nucleus (volume of 1000 /im^) and 
for both spherical and flat ellipsoidal shapes. (The scatter plots for the large and 
small volume nuclei are similar, so we do not show them here.) A least-squares 
regression line is also shown. In both graphs, a negative trend is detectable, meaning 
that the larger ellipsoids tend to lie closer to the nuclear center, while the smaller 
ones prefer peripheral positions. This is the opposite trend to the one observed in 
nature, suggesting that the minimum-overlap criterion alone is insufficient to explain 
the experimental results. 
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Chromosome 



Chromosome 



(a) Spherical Nucluei 



(b) Ellipsoidal Nuclei 



Fig. 5.1. Scatter plots and regression lines for distances of ellipsoidal chromosome territories 
to nucleus center, for medium-sized nuclei. Horizontal axis is CT volume, vertical axis is distance 
to nucleus center. 



In Table [5T3| we report the slopes of the regression line for all six scenarios. Inter- 
estingly, the negative trend is consistently weaker for flat ellipsoidal nuclei compared 
to spherical nuclei. Experimentalists report a preference for larger CTs to be on the 
periphery for ellipsoidal nuclei, while for spherical nuclei, the radial preferences are 
believed to be correlated with gene density. 

Table 5.3 

Slope of regression line for all six scenarios. 





small 


medium 


large 


spherical 


-0.0050562 


-0.0029405 


-0.0020672 


ellipsoidal 


-0.0047345 


-0.0025045 


-0.0018849 



5.4. Radial preferences assuming heterologous CT groupings. Khalil et 
al. [9j showed that CTs tend to assemble in heterologous neighborhoods, causing the 
distances between homologous chromosome pairs to be larger in general than het- 
erologous inter-CT distances. They discuss a number of possible explanations for this 
phenomenon, such as that heterologous neighborhoods act as a buffer zone in prevent- 
ing inter-homologue recombination and protect against the loss of heterozygosity. The 
authors also analyze whether the radial preferences discussed in the previous subsec- 
tion could explain the preference for arrangements with larger homologous inter-CT 
distances. Using simulations, they give a negative answer to this question. 

In the following analysis, we invert the question, asking instead whether the 
preference for heterologous neighborhoods can explain the observed radial preferences. 
To investigate this question, we add penalties to our model to discourage the CTs in a 
homologous pair from being too close to each other. We solve the resulting formulation 
using a combination of Algorithm[l]for sphere packing with Algorithm 4.4 for ellipsoid 
packing. 

We denote the set of index pairs («, j) corresponding to homologous chromosome 
pairs by H and we introduce a new variable rj to capture the proximity of CTs in a 
homologous pair. Specifically, we define for each ellipsoid i an enclosing sphere that 
is concentric with the ellipsoid i, with radius A times the maximum semi- axis length 
ri of the CT, where A > 1 is a user-defined parameter. We define rj to be the maximal 
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overlap of these enclosing spheres, over all homologous pairs, by adding constraints 

then add a penalty term crj to the objective (where 
to obtain the following extension of formulation 



whose form is similar to (3.4b ). We then add a penalty term crj to the objective (where 
c > is some penalty parameter). 



(4.5) 



min £ + cTi 

^,?),(ci,Si,Si),i=l,2,...,JV 

subject to ^ > 0(ci, Cj, Sj, Ej), 

-"T]) - \\Ci - Cj\\2 < rj, 

Si = S'f , 

semi-axes oi Ei are r^i, ri2, J'is, 
We can relax this to obtain an extended formulation of 



I <i < i < N, 
i = l,2,...,7V, 
i = l,2,...,7V. 



(5.1a) 

(5.1b) 
(5.1c) 
(5.1d) 
(5.1e) 
(5.1f) 



4.9). To solve, we extend 



Algorithm 4.4 by adding linearizations of the constraints (5.1b I to each subproblem 



in the manner of (3.6b I 



For our simulations, we choose c = 100 and A — 1.25. As in Subsection |5.3[ we 
generated about 100-200 trials by perturbing CT volumes and dimensions randomly 
around given mean values and using different random starting points. The screening 
procedure described in the previous subsection was applied to remove those trials 
with less competitive final objective values. Statistics for the final objectives are 
shown in Table |5.4[ The large objective values in the first line of the table indicates 
that for small spherical nuclei, it was not possible to find solutions in which the 
honiolog separation was enforced adequately. (The only trial that survived screening 
was one that violated these conditions significantly less than most others.) Among 
the other scenarios, only the medium spherical nucleus saw significant numbers of 
trials removed by screening. Here, most of the trials attained final objectives quite 
close to 1.90, while the others had significantly higher values. In the other four 
scenarios — small ellipsoidal, medium ellipsoidal, large spherical, and large ellipsoidal 
— proximity penalties for homologous pairs were not incurred, and final objective 
values were tightly clustered. 

The convex relaxation of our problem that encourages separation of homologous 
CT pairs does less well in preserving the dimensions of the ellipsoids than the formu- 
lation considered in Section |5.3| For spherical nuclei 26 out of the 46 CTs for small 
nuclei experienced a relative change in the half-axes lengths of more than 10%. For 
the medium spherical nuclei it was in average 11 out of 46 and for the large spherical 



Table 5.4 

Statistics for final objective values attained in the six scenarios in which homolog proximity 
is penalized, showing number of trials, means, and standard deviations, both before and after the 
screening step. 







Before Screening 


After Screening 


shape 


vol (nm^) 


trials 


mean 


sd 


trials 


mean 


sd 


spherical 


500 


100 


294.0617 


46.8185 


1 


184.0292 


0.0000 


ellipsoidal 


500 


100 


3.6556 


0.2691 


95 


3.5987 


0.0879 


spherical 


1000 


200 


15.0885 


20.7260 


114 


1.8993 


0.0833 


ellipsoidal 


1000 


200 


2.0088 


0.5118 


192 


1.9060 


0.0691 


spherical 


1600 


100 


0.4424 


1.2293 


94 


0.1369 


0.0213 


ellipsoidal 


1600 


100 


0.2752 


0.8097 


97 


0.1343 


0.0251 
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21 Y IB 16 15 13 12 B X 6 5 4 3 2 21 Y IB 16 15 13 12 B X 6 5 4 3 2 



Chromosome Chromosome 
(a) Spherical Nuclei (b) Ellipsoidal Nuclei 

Fig. 5.2. Scatter plots and regression lines for ellipsoidal chromosome territory distances to 
nucleus center, where penalties to enforce heterologous groupings are present in the objective. Hor- 
izontal axis is CT volume, vertical axis is distance to nucleus center. 



nuclei 17 out of 46. The statistics for the ehipsoidal nuclei were somewhat smaller: 12 
for the small nuclei, 4 for the medium nuclei, and 9 for the large nuclei. On the small 
nuclei, the distortions can be explained by the tightness of space, while on large nu- 
clei, the fact that all CTs can be fit without any overlap reduces the need for them to 
adopt their lowest-volume dimensions (which would achieve the prescribed semi-axis 
lengths) . 



Figure 5.2 contains scatter plots showing the mean volume of each CT (on the 
horizontal axis) plotted against the distance between the center of that CT and the 
nuclear center (on the vertical axis), for a medium-sized nucleus (volume of 1000 

scatter 



5.3 



ixm^) and for both spherical and flat ellipsoidal shapes. (As in SubsectionL 
plots for the large and small volume nuclei are similar, so we do not show them 
here.) Here the regression line shows a significant positive trend, meaning that the 
smaller ellipsoids tend to lie in the interior of the nucleus, while the larger ones prefer 
peripheral positions. Hence, by adding penalties on nearness of homologous pairs to 
the formulation, we are able to match the radial preferences observed in nature. 



Another interesting observation, more evident in Figure 5.2 a), is that the X and Y 
chromosomes both lie closer to the nucleus center than their size would suggest. This 
makes sense, as these are the only two chromosomes not subject to the homologous- 
pair separation penalties. 



In Table 5.5 we report the slopes of the regression line for all six size / shape 
scenarios considered in this section. These results highlight a significant difference 
between spherical and flat-ellipsoidal nuclei. The radial preference is consistently 
weaker for spherical nuclei than in flat-ellipsoidal nuclei. 

Table 5.5 

Slope of regression line for all six scenarios assuming heterologous CT groupings. 





small 


medium 


large 


spherical 
ellipsoidal 


0.0031803 
0.0081768 


0.0078299 
0.0102088 


0.0072028 
0.0145001 



6. Discussion. We have described a bilevel optimization procedure for finding 
local solutions of the problem of packing spheres and ellipsoids in finite volumes, and 
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used these procedures to model and analyze chromosome arrangement in cell nuclei. 
Semidefinite programming duality is used to obtain the sensitivity information needed 
to construct the approximation to the upper-level problem that is solved at each iter- 
ation of the trust-region procedure. Our convergence analysis takes place in a general 
setting in which the lower-level problems are semidefinite programs parametrized by 
their objective coefficient matrix; it is not confined to the specific form of the semidef- 
inite programs arising from the S-procedure for overlapping ellipsoids. Thus it may 
be adaptable to other design problems involving parametrized systems that can be 
modeled by semidefinite programs. 

In the CT packing application discussed in Section [5] we initially found that the 
arrangements observed experimentally could not be explained by the simple geomet- 
ric principle of minimizing the maximum overlap. However, when we enhanced the 
model to capture the recently observed phenomenon of heterologous neighborhoods / 
homologous pair separation, the radial preferences observed in nature (in which larger 
CTs tended to lie further from the nuclear center) were recovered in our simulations. 
The homologous-pair-separation aspects of our model are governed by two positive 
parameters c and A; we reported results in Subsection |5.4| only for the values c = 100 
and A = 1.25. From an examination of Tables [573| and [575} we speculate that it would 
be possible to choose these parameters in such a way that the slope of the regression 
line for spherical nuclei would be approximately zero, while the corresponding slope 
for ellipsoidal nuclei would be positive. Such a result would be consistent with exper- 
imental observations that identify no clear radial preference for spherical nuclei, but 
a pronounced radial preference for ellipsoidal nuclei. 

We obtained results on a limited but representative range of nuclei dimensions. In 
future work, we will explore CT configurations for a wider range of ellipsoidal shapes 
and sizes, corresponding to known dimensions of nuclei in different cell types. We 
will also enhance the model as further biological results are obtained, aiming to find 
biologically plausible, elementary principles that explain experimental observations 
(in the spirit of Occam's Razor). 

Appendix A. Technical Results for Parametrized Semidefinite Pro- 
grams. 

In this section we consider the following primal-dual pair of semidefinite programs 
that are parametrized by the primal objective term C: 

min(C,X) s.t. (Ai,X)=6„ i = l,2, X ^ 0, (A.l) 



max b C 



s.t. 



(A.2) 



We denote solutions of these problems by X{C) and (C(C), S{C)), respectively. (Our 
interest is in the application to the SDP pair (4.13), ( |4.17 ), but we have simplified 
the notation here.) 



We show first that the solutions to ( A.l ) are uniformly bounded in a neighborhood 
of a C for which a strictly feasible point for the dual (A.2) exists. The result is an 
almost immediate consequence of [171 Theorem 4.1]. 

Lemma A.l. Consid er the primal-dual pair {A.l), (A. 2 ) of semidefinite pro- 
grams: Suppose that 



A.l) is feasible (with feasible point X ) , and that at some matrix 



Co e S'R"''" 

ues of S are bounded below by a > 



there exists a .strictly feasible point {C, S) for {A.2), where the eigenval- 

Then there exists a constant 5 > such that 
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for all matrices C G 5R"^" with ||C — Co\\f < S, (A.l) has a nonempty solution set, 
and all solutions X{C) are hounded as follows: 



\\X{C)\\f < -{{X,S)+S\\X\\f). 
a 



Moreover the optimal values of the problems {A.l) and HTM) are equal 



Proof. We have for any X feasible for the primal (note that the primal feasible 
region does not depend on C) that 

{C,X)^{Co,X) + {C~Co,X) 
p 

= {Y,QA + S,X) + {C~Co,X) 
i=l 

= b^C + {S,X) + {C-Co,X). 

Since is independent of X, we can obtain an equivalent to the primal problem by 
replac ing i ts objective {C,X) by {S + C — Co,X). Using the assumed feasible point 
X of (A.l) (note that there is no dependence of X on C), we can formulate (A.l I 
equivalently as follows: 



min(S' + C-Co,A:) s.t. {A,,X)=b,, i = l,2,...,p, X^O, 



{S + C-Co,X) < {S + C -Cq,X). 



(A.3a) 
(A.3b) 



By choosing 6 G (0,ct/2], we have that all eigenvalues of 5 + C — Cq are bounded 
below by a/2. Hence from "Fact 14" of |T7], we have that 

{S+C-Co,X) < (S+C~Co,X) \\X\\f <-{S+C-Co,X) <-{{S,X)+5\\X\\f). 

a (T 



Hence, (A.3) involves the minimization of a continuous function over a nonempty 



compact set, so the solution set exists, and moreover, all solutions are bounded as 
claimed. 

The last claim can be derived exactly as in [ITl Theorem 4.1]. □ 
The next result examines the solution of a sequence of parametrized SDPs. 
Theo rem A. 2. Given Ai, i = 1 = 1, 2, . . . ,p and b € as in Lemma A.L such 
that {A.l ) is feasible, let C £ iSR"^" be such that there exists a strictly feasible point 
for (A. 2) when C = G . Gonsider a sequence {Ck\ with Gk G S'R"^" and Gk — >■ C. 
Then the following is true: 



(i) There is a constant /3 > and index K such that {A.l) with G = Gk has 
nonempty solution set for all k > K, and \\X(Gk)\\F ^ P for all such solu- 
tions. 

(a) If {X{Gk)} is a sequence of solutions of {A.l^ with G = Gk, then this sequence 
has at least one accumulation point, and all such accumulation points are 
solutions of {A.l) with G — G. 



Proof. The first claim (i) is an immediate consequence of Lemma A.l For (ii), 
note that boundedness of X(Gk) ensures existence of accumulation points. Suppose 
that X is s uch a point and assume WLOG that X{Gk) ~^ X. N ote first that X is 
feasible for (A.l I regardless of C. If X were not optimal for (A.l) with G — C, then 



there would exist another feasible matrix X with {G,X) < {G,X). But since 



lim {Gk,X) 

k 



{C,X) < {C,X) ^\im{Gk,X(Gk)), 

k 
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we have that {Ck,X) < {Ck, X{Ck)) for aU k sufficiently large, contradicting optimal- 
ity of X{Gk). Hence (ii) is true. □ 



We next prove some elementary and useful facts about the value function of ( A.l ) 
which we denote by t{C). 



Lemma A. 3. Suppose that (A.l) is feasible, and let C € S'R"^" he such that there 
exists a strictly feasible point for {A. 2) when C = C . Then there exists a neighborhood 



M of C within which the following claims are true, 
(i) t(-) is a concave function, 
(ii) For all C £ M and all AC G S'R"^", we have 

t{C + AC) < t(C) + (X(C), AC), 

where X{C) is any solution of {A.l^ 



(A.4) 



(Hi) Any X{C) that solves 
(iv) t{-) is Lipschitz continuous in M 
Proof. The proof of (i) is elementary. 



A.l) belongs to the Clarke subdifferential of t{-) at C . 



For (ii), note first from Lemma A.l that we can choose N so as to ensure that a 
solution X{C) to (A.l) exists for all C G A/". We have (denoting by the feasible set 
for ( |A.ip ) that 



t{C + AC) 



min(C + AC,A) < (C + AC,A(C)) = t(C) + (AC, X(C)), 



as required. 

For (iii), note that (by taking the negative of the inequality (A.4)) —X{C) is a 
subgradient of the convex function (— i)(C), and so ~X{C) belongs to the Clarke 
subdifferential of {-t){C). It follows from p. 128, Exercise 8(c)] (with A = -1) 
that X{C) belongs to the Clarke subdifferential of i(C), as claimed. 

For (iv), note from Lemma A.l that we can choose M such that ||A(C)|| is 
uniformly bounded for all C G A/" (by /3 > 0, say) 
two points in A/", we have from (A.4 1 that 



Denoting by Ci and C2 any 



<(C2) <t(Ci) + (A(Ci),C2-Ci), 

t(Ci) <t(C2) + (A(C2),Ci-C2). 



Thus 



|i(Ci) - t{C2)\ < max(||A(Ci)||, ||A(C2)||) ||Ci - C2II < P\\Ci - C2I 
proving the Lipschitz property. □ 
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